Use of SVD-based probit transformation in clustering gene expression profiles

نویسنده

  • Faming Liang
چکیده

The mixture-Gaussian model-based clustering method has received much attention in clustering gene expression profiles in the literature of bioinformatics. However, this method suffers from two difficulties in applications. The first one is on the parameter estimation, which becomes difficult when the dimension of the data is high or the size of a cluster is small. The second one is on the normality assumption for gene expression levels, which is seldom satisfied by real data. In this paper, we propose to overcome these two difficulties by the probit transformation in conjunction with the singular value decomposition (SVD). SVD reduces the dimensionality of the data, and the probit transformation converts the scaled eigensamples, which can be interpreted as correlation coefficients as explained in the text, into Gaussian random variables. Our numerical results show that the SVD-based probit transformation enhances the ability of themixture-Gaussianmodel-based clusteringmethod for identifying prominent patterns of the data. As a by-product, we show that the SVD-based probit transformation also improves the performance of the model-free clustering methods, such as hierarchical,K-means and self-organizing maps (SOM), for the data sets containing scattered genes. In this paper, we also propose a run test-based rule for selection of eigensamples used for clustering. © 2007 Elsevier B.V. All rights reserved.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A robust wavelet based profile monitoring and change point detection using S-estimator and clustering

Some quality characteristics are well defined when treated as response variables and are related to some independent variables. This relationship is called a profile. Parametric models, such as linear models, may be used to model profiles. However, in practical applications due to the complexity of many processes it is not usually possible to model a process using parametric models.In these cas...

متن کامل

Application of Singular Value Decomposition and Functional Clustering to Analyzing Gene Expression Profiles of Renal Cell Carcinoma

Microarray gene expression profiles of both renal cell carcinoma (RCC) tissues and a RCC cell line were analyzed using singular value decomposition (SVD) and functional clustering. The SVD projections of the expression profiles revealed significant differences between the profiles of RCC tissues and a RCC cell line. Based on the biological processes, selected genes were annotated and clustered ...

متن کامل

Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members

Graphs have so many applications in real world problems. When we deal with huge volume of data, analyzing data is difficult or sometimes impossible. In big data problems, clustering data is a useful tool for data analysis. Singular value decomposition(SVD) is one of the best algorithms for clustering graph but we do not have any choice to select the number of clusters and the number of members ...

متن کامل

Transient expression of green fluorescent protein in radish (Raphanus sativus) using a turnip mosaic virus based vector

It is possible to use transgenic plants, as bioreactors, for the production of recombinant inexpensive chemicals and drug components. Transient gene expression is an appropriate alternative to stable transformation because it allows an inexpensive and rapid method for expression of recombinant proteins in plant tissues. In recent years, plant viral vectors have been increasingly developed as su...

متن کامل

Multivariate Feature Extraction for Prediction of Future Gene Expression Profile

Introduction: The features of a cell can be extracted from its gene expression profile. If the gene expression profiles of future descendant cells are predicted, the features of the future cells are also predicted. The objective of this study was to design an artificial neural network to predict gene expression profiles of descendant cells that will be generated by division/differentiation of h...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Computational Statistics & Data Analysis

دوره 51  شماره 

صفحات  -

تاریخ انتشار 2007